Background: The sea urchin represents an important research model in a wide variety of
fields, including molecular biology, evolution, and biomaterial formation. With regard to the
latter, the species Strongylocentrotus purpuratus (purple sea urchin) and its calcium
carbonate-containing adult spine and embryonic spicule skeletal elements have provided insights
into the formation, stabilization, and crystalline transformation of amorphous minerals.1-10 In
the sea urchin spicule, it is now believed that hydrated and dehydrated variants of amorphous
calcium carbonate (ACC) are the true precursors to crystalline calcite.1 Moreover, in certain
regions of the spicule the hydrated ACC variant persists as stabilized nanoparticles alongside
calcite crystals.1 Recent studies have shown that the spicule matrix (SM) proteins of
S. purpuratus embryonic spicules are physically associated with ACC deposits and may be involved
in ACC localization and its eventual transformation.1,10 The entrapment of these proteins within
the mineral phase is also one of the major contributors to the fracture resistance of these
skeletal elements.6,9 Hence, there is great interest in understanding the functional role(s) of
these proteins within the spiculogenesis process, not only from a biological standpoint, but also
from a materials perspective that is relevant to the mission of the US Army Research Office.


With the successful sequencing of the S. purpuratus genome,11,13 we now know that there
are 16 unique SM biomineral-associated genes. Via RNA splicing pathways, these genes code
for > 40 expressed matrix proteins11,14 that are secreted by the primary mesenchyme cells into a
membrane-bound mineralization space where the spicule will form.5,8,12,13 All expressed SM
proteins feature a canonical structure consisting of an N-terminal leader sequence and a C-type
lectin-like domain (CTLL),11-14 and 10 SM proteins also contain a C-terminal Gln, Pro,
Gly-rich repetitive domain.11-14 It is known that the SM proteins assemble to form a concentric

1  10  20  30  40  50  60  70  80

QDCPAYYVRS  QSGQSCYRYF  NMRVPYRMAS  EFCEMVTPCG  NGPAKMGALA  SVSSPGENME  IYQLVAGFSQ  DNQMENEVWL 

81  90  100  110  120  130  140  150  160

GWNSQSPFFW  EDGTPAYPNG  FAAFSSSPAS  PPRPGMPPTR  SWPVNPQNPM  SGPPGRAPVM  KRQNPPVRPG  QGGRQIPQGV 

161  170  180  190  200  210  220  230  235 

GPQWEAVEVT  AMRAFVCEVP  AGRNIPIGQQ  PGMGQGGFGN  QQPGMGGRQP  GFGNQPGMGG  RQPGFGNQPG  MGGRQ 

236  250  260  270  280  290  300  306 

PGWGN  QPGVGGRQPG  MGGQQPGWGN  QPGVGGRQPG  MGGQPGVGGR  QPGFGNQPGM  VDNNQAWWTTTRLGNQ 

307  320  330  340  350  360  370  380 

PGVG  GRQPGMGGQP  GVGGRQPGVG  GRQPGFGNQP  GVGGRQPGMG  GQQPGMGGQP  GVGGRQPGMG  GRPPGFGNQP 

381  390  400  410  420  428 

GVGGRQPGMG  GQQ  RFNRPRM  LQEADALA 

Figure 1: Primary sequence of mature S. purpuratus spicule matrix protein SM50. Color coding of sequence regions: Green =
CTLL domain; Red = Gln, Pro, Gly-rich repeat region; Blue = Pro, Asn-rich region.

matrix  that  courses  throughout  the  mineral  phase  of  the  spicule,  with  some  SM  proteins  localized 
at  the  growing  mineralization  front  and  others  participating  in  the  secretion  of  spicule 
components.9,14-16 Thus, there are specific functional and regional roles for each SM protein, and
spicule matrix assembly is clearly a critical event in the mineralization process.
However, very little is known regarding the spicule matrix protein assembly process and its
organization, or the involvement of SM proteins with ACC deposits.

To derive a better understanding of the overall spicule matrix assembly process, one must
first understand the assembly behavior of individual SM proteins and then progress to mixed SM
systems where heterogeneous protein-protein interactions can be evaluated. At present, the
best candidate for individual study is SM50, a 44.5 kDa basic (pI = 10.73) single polypeptide


Figure 2: (A) DLS-determined hydrodynamic radii (Rh) for 164 µM apo-SM50 oligomers at pH 9.76 in 10 mM NaHCO3 /
Na2CO3 buffer. For comparison, we present Rh values obtained in the same buffer/pH solutions for bovine erythrocyte carbonic
anhydrase II (Sigma/Aldrich, USA), A. rigida mollusk shell prismatic protein Asprich “3”, and H. rufescens mollusk shell nacre
protein, AP7. (B) Far UV circular dichroism spectrum of 6 µM SM50, pH 9.76 in 10 mM NaHCO3 / Na2CO3 buffer, as a
function of 2,2,2-trifluoroethanol (TFE) content (0, 10, 20, 30, 50, 75% v/v). (C) Fraction % of secondary structures calculated by
TFE titration of 6 µM SM50, pH 9.76 in 10 mM NaHCO3 / Na2CO3 buffer.

(Figure 1) that is the most abundant protein in the embryonic spicule, the mature adult spine,1,9,12
and the tooth and test skeletal elements of this sea urchin.12-16 This protein is a member of a
subfamily that includes SM37, SM32, SM29, PM27 and three predicted SM29-related spicule
proteins.11-14 In the spicule, SM50 is preferentially localized along the interior of the spicule
sheath at the periphery of the mineral phase,1,6,9,15,16 and thus it is believed to play a major role in
ACC stabilization and transformation.1,20-22 The primary structure of SM50 features the canonical
CTLL domain within the N-terminal portion of the protein.11-14 Intriguingly, the C-terminal
domain contains two repetitive domains and a charged C-terminus. The first is a 203 AA Gln,
Pro, Gly-rich consensus repeat sequence, -QPG(F/M/W)G(N/G)QPG(V/M)GG(R/Q)-,12,14,17


Figure 3: Representative AFM images of rSM50 oligomers (6 µM) forming on mica
substrates at pH 9.76 in 10 mM NaHCO3 / Na2CO3 buffer.

with the most common variants -GVGGR- and -GMGGQ- homologous to both elastin and spider
dragline silk protein elastomeric repeats.18,19 The second is a conformationally labile 20 AA
Pro, Asn-rich repeat19 that is upstream of the Gln, Pro, Gly-rich repeat.12,14,17-19
Unfortunately, we know very little about the function(s) of the CTLL and repetitive domains, the
aggregation or assembly properties of SM50, or how protein self-assembly might facilitate ACC
formation and stabilization.
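
The consensus repeat quoted above, -QPG(F/M/W)G(N/G)QPG(V/M)GG(R/Q)-, maps directly onto a regular expression, so its occurrences can be located in a sequence. The following is a minimal sketch, not part of the original study. It assumes the fragment below is residues 201-290 exactly as printed in Figure 1, and the names FRAGMENT, FRAGMENT_START, CONSENSUS and find_repeats are illustrative only.

```python
import re

# Residues 201-290 of mature SM50, copied block by block as printed in Figure 1.
# Assumption: the printed sequence and its numbering are taken at face value.
FRAGMENT = "".join([
    "QQPGMGGRQP", "GFGNQPGMGG", "RQPGFGNQPG", "MGGRQ", "PGWGN",
    "QPGVGGRQPG", "MGGQQPGWGN", "QPGVGGRQPG", "MGGQPGVGGR", "QPGFGNQPGM",
])
FRAGMENT_START = 201  # residue number of FRAGMENT[0] in Figure 1 numbering

# -QPG(F/M/W)G(N/G)QPG(V/M)GG(R/Q)- written as a character-class pattern.
CONSENSUS = re.compile(r"QPG[FMW]G[NG]QPG[VM]GG[RQ]")


def find_repeats(seq, offset=1):
    """Return (start_residue, matched_text) for each non-overlapping match."""
    return [(m.start() + offset, m.group()) for m in CONSENSUS.finditer(seq)]


repeats = find_repeats(FRAGMENT, FRAGMENT_START)
for start, text in repeats:
    print(start, text)
```

Because `finditer` reports non-overlapping matches only, copies of the repeat that share residues with a previous match are not counted separately. A strict pattern like this one also misses degenerate copies that deviate from the consensus at any position.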


Research: To address these questions, we initiated in vitro studies of apo-SM50 self-assembly
using a recombinant form of SM50 (rSM50) and solution conditions approximating those of in vitro
pre-nucleation cluster mineralization assays.20-22 Under these conditions we find that apo-rSM50 is an

Figure 4: 600 MHz 1H NMR TOCSY spectra of 116 µM rSM50 in UDDW, pH 7.5. (A) NH - CHα
fingerprint region; (B) Sidechain aliphatic region. Tentative assignments of amino acid spin
systems are indicated on the plots.

intrinsically disordered protein (Figure 2) that is fold-inducible and assembles to form
disordered supramolecular complexes that possess a high degree of dimensional heterogeneity
(Figures 2A, 3). These protein assemblies are “plastic”, i.e., they are highly dynamic, with
evidence of backbone and sidechain motion emanating from the repetitive Gln, Pro, Gly and Pro,
Asn repeat domains of the C-terminal region (Figure 4). Interestingly, the N-terminal CTLL
region does not exhibit this phenomenon. We note that dynamic, labile behavior is also common
to other mineral-stabilizing biomineralization protein assemblies23,29 and to disordered
polymer-induced liquid precursor (PILP) phases that stabilize amorphous minerals in vitro and regulate

Figure 5: Graphical representations of predicted locations of (A) intrinsic disorder (GLOBPLOT,
DISOPRED, IUP algorithms, blue color) and (B) amyloid-like aggregation-prone (zipperDB,
FOLD-AMYLOID, AGGRESCAN algorithms, red color) regions of the SM50 mature sequence. The
sequence locations of the CTLL (green) and Gln, Pro, Gly repetitive (yellow) domains are
shown as overlays.

their transformation into crystalline solids.30,31 Using bioinformatics, we confirm that the
C-terminal Gln, Pro, Gly repetitive domain is the primary source of intrinsic disorder23,32,38
in the SM50 sequence (Figure 5). Interestingly, bioinformatics predictions indicate that the
N-terminal CTLL region of SM50 possesses a significant level of amyloid-like cross-beta strand
regions (Figure 6),23,39,43 which are important for protein-protein assembly; this implicates the
CTLL domain in SM50 supramolecular assembly. Thus, SM50 is a disordered, aggregation-prone
protein that forms highly labile, dynamic PILP-like protein assemblies, and these features may
facilitate ACC localization in the embryonic spicule.
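
The compositional bias that gives the Gln, Pro, Gly-rich repeat its name can be checked by simple residue counting. The sketch below is illustrative only and is not how GLOBPLOT, DISOPRED or IUP predict disorder. It assumes the two fragments are copied as printed in Figure 1: residues 1-80, taken here as representative of the N-terminal portion where the text places the CTLL domain, and residues 201-290 from the repeat region. Exact domain boundaries are not given in the text, and the names N_TERMINAL, REPEAT and fraction_of are hypothetical.

```python
from collections import Counter

# Residues 1-80 of mature SM50 as printed in Figure 1 (N-terminal portion).
# Assumption: used as a stand-in for the CTLL-containing region.
N_TERMINAL = "".join([
    "QDCPAYYVRS", "QSGQSCYRYF", "NMRVPYRMAS", "EFCEMVTPCG",
    "NGPAKMGALA", "SVSSPGENME", "IYQLVAGFSQ", "DNQMENEVWL",
])

# Residues 201-290 as printed in Figure 1 (within the repeat region).
REPEAT = "".join([
    "QQPGMGGRQP", "GFGNQPGMGG", "RQPGFGNQPG", "MGGRQ", "PGWGN",
    "QPGVGGRQPG", "MGGQQPGWGN", "QPGVGGRQPG", "MGGQPGVGGR", "QPGFGNQPGM",
])


def fraction_of(seq, residues="QPG"):
    """Fraction of positions in seq occupied by any of the given residues."""
    counts = Counter(seq)
    return sum(counts[r] for r in residues) / len(seq)


n_term_qpg = fraction_of(N_TERMINAL)
repeat_qpg = fraction_of(REPEAT)
print(f"Gln+Pro+Gly fraction, residues 1-80:    {n_term_qpg:.2f}")
print(f"Gln+Pro+Gly fraction, residues 201-290: {repeat_qpg:.2f}")
```

A high Gln, Pro, Gly fraction marks a low-complexity region but does not by itself establish disorder; the predictions summarized in Figure 5 rely on the dedicated algorithms named in its caption.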

Conclusions:  One  of  the  major  engineering  feats  of  the  developing  spicule  matrix  is  the 
assembly  of  a  protein  scaffolding  network  that  readily  adapts  itself  to  the  emergence  of  ACC 
clusters  and  eventually  persists  within  an  intracrystalline  environment  as  the  ACC  phase 
transforms  into  crystalline  calcite.  Collectively,  our  findings  indicate  that  SM50  is  suitably 
adapted  for  a  major  role  in  this  process.  We  confirm  that  rSM50  spontaneously  oligomerizes  to 
form  amorphous,  heterogeneous  supramolecular  protein  complexes  that  can  form  films  and 
behave  in  a  relatively  mobile  fashion.  This  would  provide  a  means  for  quickly  assembling  a 
protein  matrix  with  fluid  or  labile  features  that  are  commensurate  with  those  of  the  ACC  phase 
itself. Moreover, the lability of rSM50 assemblies would allow the matrix to adapt to the changing
shape and dimensions of spicules as they undergo developmental elongation and maturation. In other
words, the SM50-dominated spicule matrix would be “plastic” and thus well suited for embryonic
development and eventual mineralization, with the added benefit of providing a cushioning or
compressive phase as fracture-resistant intracrystalline components within crystalline calcite.

Relevance of this research to ARO: Our findings provide a blueprint for designing new
composite materials that incorporate protein-inspired design principles. First, intrinsic disorder
is a primary component for designing “reactive” polymers that can form assemblies that will interact
with other components, such as inorganic solids, at a later stage in the composite assembly
process. Second, the incorporation of amyloid-like sequence motifs can create specific docking
sites for protein-protein or polymer-polymer interactions that can stabilize large supramolecular
assemblies. The degree to which these two features are incorporated into future polymers or
proteins could be used to “tune” the assembly process and the molecular features of the resultant
assemblies.

Publications  generated  from  this  grant: 